Abstract. We consider edge detection as the problem of measuring and localizing changes
of light intensity in the image. As discussed by Torre and Poggio (1984), edge detection, 
when defined in this way, is an ill-posed problem in the sense of Hadamard. 

Using standard regularization theory, we regularize the problem with a stabilizing 
functional that is a specific form of a Tikhonov stabilizer, following Reinsch (1967) and 
Schoenberg (1964). The regularized solution that arises is then the solution to a variational 
principle. In the case of exact data, one of the standard regularization methods (see Poggio 
and Torre, 1984) leads to cubic spline interpolation before differentiation. We show that in 
the case of regularly-spaced data this solution corresponds to a convolution filter—to be 
applied to the signal before differentiation—which is a cubic spline. In the case of non-exact 
data, which is the most interesting situation, we use another regularization method that leads 
to a different variational principle. We prove (1) that this variational principle leads to a 
convolution filter for the problem of one dimensional edge detection, (2) that the form of 
this filter is very similar to the gaussian filter, and (3) that the regularizing parameter λ in
the variational principle effectively controls the scale of the filter.

Finally, we outline several issues arising from our solution to the edge detection 
problem: (1) the use of methods from regularization theory for finding the optimal value
of the regularizing parameter λ; (2) the connection between these methods and the
scale-space method for edge detection; (3) the relationship between our edge detector and
other detectors, especially the Marr/Hildreth edge detector; (4) the extension of our
one-dimensional solution to two-dimensional edge detection; and (5) the extension of our method
to deal with differentiation of surface data (though the physical constraint underlying the
form of the regularizer is not valid in general for depth data); this issue is connected to the
problem of interpolating and approximating depth data. 
1. Introduction


Edge detection does not have a precisely defined goal. The word “edge” itself, which refers 
to physical properties of objects, is somewhat of a misnomer. Several years of experience 
have shown that the ideal goal of detecting and locating physical edges in the surfaces 
being imaged is very difficult and still out of reach (for a review see Brady, 1982). Edge 
detection has come to be defined as the first step in this goal of detecting physical changes 
such as object boundaries—the operation of detecting and locating changes in intensity in 
the image. Other processes which operate on these measurements of intensity changes will 
then group boundaries and label and characterize them in terms of the properties of the 
3-D surfaces. 

Intended in this narrow sense, edge detection—this first step in processing the 
image—is mainly the process that measures, detects and localizes changes of intensity. 
Derivatives must be estimated correctly to label the critical points in the image intensity 
array, characterize their local properties (are they minima or maxima or saddle points?) 
and thus relate them to the underlying physical process (are they shadow edges or depth 
discontinuities?). As a consequence, several different derivatives of the image, possibly at 
different scales, may have to be estimated. 1 

In this sense, Torre and Poggio (1984) considered edge detection as a problem of 
numerical differentiation of images. The problem is not straightforward, and attempts over 
many years have proven its difficulties. Considered as a problem of numerical differentiation, 
edge detection turns out to be an ill-posed problem. As explained by Poggio and Torre 
(1984), mathematically ill-posed problems are problems where the solution either does not 
exist or is not unique or does not depend continuously on the data. 

Numerical differentiation is a (mildly) ill-posed problem because its solution does not
depend continuously on the data. It is therefore natural to try to solve this problem by using 
regularization techniques developed in recent years for dealing with mathematically ill-posed 
problems. The problem can be regularized by the use of a wide class of filters (Torre and 
Poggio, 1984, section 2.4; see also Duda and Hart, 1973). In the following section we 
consider two specific regularizing operators, that in some sense are very natural. 


2. Regularizing Edge Detection 

To regularize an ill-posed problem and make it well-posed, one has to introduce generic 
constraints on the problem. In this way, one attempts to force the solution to lie in a subspace 
of the solution space, where it is well defined. The basic idea of regularization techniques 
is to restrict the space of acceptable solutions by choosing the function that minimizes an 
appropriate functional. Poggio and Torre consider in particular standard regularization
theory based on quadratic variational principles. They list three main techniques for
regularizing the ill-posed problem of finding $z$ from the data $y$ such that $Az = y$. They
involve the choice of norms $\|\cdot\|$ (usually quadratic) and of a stabilizing functional $\|Pz\|$.
The choice is dictated by mathematical considerations, and, most importantly, by a physical
analysis of the generic constraints on the problem. Three main methods can then be applied
(see Bertero, 1982):

(1) Among $z$ that satisfy $\|Pz\| \le C$, where $C$ is a constant, find $z$ that minimizes

$\|Az - y\|$, (1)


1 A very similar problem arises in the characterization of surface properties—in particular their 
differential properties—from depth data. 
Figure 1 Cubic Interpolating Spline Filter ($L_4$ function of Schoenberg)


(2) Among $z$ that satisfy $\|Az - y\| \le C$, find $z$ that minimizes

$\|Pz\|$, (2)

(3) Find $z$ that minimizes

$\|Az - y\|^2 + \lambda \|Pz\|^2$, (3)

where $\lambda$ is a regularization parameter.

The first method consists of finding the function $z$ that satisfies the constraint $\|Pz\| \le C$ and
best approximates the data. The second method computes the function $z$ that is sufficiently
close to the data ($C$ depends on the estimated errors and is zero if the data are noiseless)
and is most “regular”. In the third method, the regularization parameter λ controls the
compromise between the degree of regularization of the solution and its closeness to the
data.
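As an illustration of the third method, the following is a minimal discrete sketch, not the construction used in this paper. It assumes that $A$ is the identity (the data are direct samples), that $P$ is approximated by a second-difference matrix on a unit grid, and that the minimizer is obtained from the normal equations $(A^T A + \lambda P^T P) z = A^T y$. The names `tikhonov_solve` and `second_difference` and the test signal are hypothetical.

```python
import numpy as np

def tikhonov_solve(A, P, y, lam):
    """Minimize ||A z - y||^2 + lam * ||P z||^2 via the normal equations."""
    lhs = A.T @ A + lam * (P.T @ P)
    return np.linalg.solve(lhs, A.T @ y)

def second_difference(n):
    """(n-2) x n matrix approximating the second derivative on a unit grid."""
    P = np.zeros((n - 2, n))
    for i in range(n - 2):
        P[i, i:i + 3] = [1.0, -2.0, 1.0]
    return P

# Hypothetical noisy samples of a smooth intensity profile.
n = 50
x = np.linspace(0.0, 1.0, n)
rng = np.random.default_rng(0)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(n)

A = np.eye(n)                 # assumption: sampling operator is the identity
P = second_difference(n)      # assumption: discrete stand-in for d^2/dx^2

z_small = tikhonov_solve(A, P, y, 1e-8)   # tiny lambda: stays close to the data
z_large = tikhonov_solve(A, P, y, 1e4)    # large lambda: smoothness dominates
```

With a very small λ the solution essentially reproduces the data, while a large λ trades closeness to the data for a smaller value of the stabilizing functional.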

In the case of edge detection considered as numerical differentiation, we want an
approximation $f$ to the intensity data $y_i$ at sample points $x_i$ that is well behaved under
differentiation. Thus we consider an operator $A$ which samples the function $f$ on the lattice,
such that $(Af)_i = f(x_i)$ for $i = 1, \ldots, N$.

The problem is then to find a suitable norm and a suitable stabilizing functional $\|Pf\|$.
It is natural to choose for $P$ the simplest form of Tikhonov’s stabilizing functionals (Tikhonov
and Arsenin, 1976) with $P = d^k/dx^k$ and the usual $L_2$ norm. For $k = 2$ this choice corresponds
to a constraint of smoothness on the approximated intensity profile $f$, with $\|Pf\|^2 = \int (f''(x))^2\,dx$.
Its physical justification is that the noiseless image has to be smooth in the sense that its
derivatives must be bounded because the image is band-limited by the optics. Band-limited
functions have bounded derivatives because $f' < \Omega M$, where $M = \sup F(\omega)$, $\Omega$ is the
cut-off frequency, and $F(\omega)$ is the Fourier transform of $f(x)$. Physically, the constraint of
smoothness allows us to effectively eliminate the noise that creeps in, after, or during the 
sampling and transduction process and makes the operation of differentiation unstable. We 
stress that this is not the only stabilizing functional possible for this problem, although it is 
probably the simplest one. 

2.1. The second regularization method and interpolating cubic splines 

With this choice of $P$, the second regularization method is: among $f$ such that $f(x_i) = y_i$,
find the $f$ that minimizes $\int (f''(x))^2\,dx$. A theorem by Schoenberg (see Greville, 1969) shows that
the solution to this problem is a cubic spline interpolating between the data points. The
following result is a reformulation of results due to Schoenberg (1946, 1964):

Theorem 1: The cubic spline function $f$ interpolating evenly-spaced data points
and minimizing $\int (f''(x))^2\,dx$ can be obtained by convolving the data points with a
cubic spline filter which corresponds to the $L_4$ function of Schoenberg.

A plot of $L_4$ is given in Figure 1. Note that the size of the filter is fixed with respect to
the sampling lattice. The filter is 0 at every pixel but the central one, where it is 1. Thus $L_4$
is an interpolating filter that does not perform any smoothing.
Figure 2 Filter R derived by regularization principles.


2.2. The third regularization method, approximating splines and the edge detection 
filter 

The third regularization method leads to the following problem: Find the $f$ that minimizes

$\sum_i (y_i - f(x_i))^2 + \lambda \int (f''(x))^2\,dx$, (4)

where λ is the regularization parameter, which can be found as described later. This problem
was considered originally by Reinsch (1967) in the case of numerical differentiation and by
Schoenberg (1964) for the problem of graduation. Both Schoenberg and Reinsch gave
the solution in terms of approximating cubic splines. In addition, we prove that for most
practical purposes, the approximating spline function can be obtained by convolving the
data points $y_i$ with the cubic spline convolution filter R shown in Figure 2 (see also Appendix
1). We then have the following theorem:

Theorem 2: The solution to equation (4), the regularized solution to the problem
of numerical differentiation, in the case of inexact data, can be obtained by
convolving the data with a convolution filter which is (a) a cubic spline, and (b) 
very similar to a gaussian. 

Although this result is especially significant in the context of edge detection where 
the search for an optimal filter and its justification has been a longstanding preoccupation, 
it is somewhat surprising that this result does not seem to have been widely appreciated 
in the numerical analysis literature. The exact assumptions under which Theorem 2 is 
valid are discussed in Appendix 1. First, the data must be given on a regular grid (as is 
the case for an image). Second, the image data must either go to zero at infinity or be 
periodic. Under these conditions, the filtering operation is space-invariant and linear (the 
Euler-Lagrange equations corresponding to the quadratic variational problem are linear). 
Thus the approximating spline can be obtained by a convolution operation. Note that the 
result that the regularizing operator corresponding to a quadratic variational principle is a 
convolution filter—for data on a regular grid and toroidal boundary conditions—is valid, in 
general, beyond the case of numerical differentiation. 

Theorem 3: Quadratic, Tikhonov type regularization principles are equivalent to 
convolving the data with a generalized spline filter, if the data are given on a
regular lattice and the boundary conditions are appropriate.
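The link between a quadratic variational principle and a convolution can be checked numerically on a discrete analogue. The sketch below is an illustration under stated assumptions, not the spline construction of Appendix 1: it replaces the integral of $(f'')^2$ by a sum of squared periodic second differences, so that the minimizer of $\|f - y\|^2 + \lambda \|Df\|^2$ is $f = (I + \lambda D^T D)^{-1} y$. The helper names are hypothetical.

```python
import numpy as np

def periodic_second_difference(n):
    """Circulant second-difference matrix with toroidal boundary conditions."""
    I = np.eye(n)
    return np.roll(I, 1, axis=1) - 2.0 * I + np.roll(I, -1, axis=1)

def regularization_operator(n, lam):
    """Matrix S with f = S y minimizing ||f - y||^2 + lam * ||D f||^2."""
    D = periodic_second_difference(n)
    return np.linalg.inv(np.eye(n) + lam * D.T @ D)

n, lam = 32, 5.0
S = regularization_operator(n, lam)
kernel = S[:, 0]          # response to a unit impulse at sample 0

rng = np.random.default_rng(1)
y = rng.standard_normal(n)

f_matrix = S @ y
# Circular convolution of the data with the impulse response, done via the FFT.
f_conv = np.real(np.fft.ifft(np.fft.fft(kernel) * np.fft.fft(y)))
```

Because every column of `S` is a cyclic shift of `kernel`, applying `S` and convolving with `kernel` give the same result; with other boundary conditions the columns would differ near the ends of the array.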

A generally interesting question is the physical correctness of the regularized solution. 
In the case considered in this paper the answer is simple and not very insightful: a necessary 
and sufficient condition for the regularized solution to be correct is that the true intensity 
distribution is a polynomial of order less than 4 between sampling points. This property can 
be derived directly from equation (3) of Appendix 1.2. 

3. Regularization parameter and comparison with the gaussian filter 

Figure 2 shows the filter R obtained by solving the variational principle equation (1) in
Appendix 1.1. Its shape and size depend on the regularizing parameter λ. Figure 3(a)
shows the first derivative of the filter for different values of λ. The continuous version of
the filter, derived in Appendix 2, is practically indistinguishable from the discrete filter, as
shown by numerical comparisons.

Figure 3 Filter R and first derivative for different values of regularizing parameter λ. (a)
λ affects size of (the first derivative of) R but (b) does not appear to affect shape of
R. (Amplitudes are normalized in (a); both amplitudes and widths are scaled linearly and
independently in (b) for comparison.)

Figure 4 Comparison of one-dimensional regularizing filter R with Gaussian: (a) zeroth,
(b) first, and (c) second derivatives.

It is rather intuitive that the smoothing parameter λ controls the effective size of the
filter. From our numerical work, it seems that λ does not significantly affect the shape of
the filter, but only its size, as shown in Figure 3(b). Changing λ amounts to scaling the size
of the filter up or down. If λ is small, smoothness is unimportant, and the filter will tend to
be an interpolating filter and therefore be similar to a δ function. On the other hand, with
a very large λ, the main weight is on smoothness, and the filter will tend to be very large.
The continuous form of the filter suggests that the role of λ is indeed equivalent to the role
of σ for the gaussian ($\lambda \sim \sigma^4$, as shown in Appendix 4).
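The scaling behaviour can be checked on the continuous filter of Appendix 2. The sketch below assumes the closed form $R(x, \lambda) = \frac{1}{2\lambda^{1/4}} e^{-|x|/(\sqrt{2}\lambda^{1/4})} \cos\left(\frac{|x|}{\sqrt{2}\lambda^{1/4}} - \frac{\pi}{4}\right)$ for the Green function that vanishes at infinity; the function name `continuous_filter` and the sampling grid are hypothetical. It compares the filter at λ = 16 with a rescaled copy of the filter at λ = 1.

```python
import numpy as np

def continuous_filter(x, lam):
    """Green function of lam * f'''' + f = delta that vanishes at infinity."""
    s = np.sqrt(2.0) * lam ** 0.25
    t = np.abs(x) / s
    return np.exp(-t) * np.cos(t - np.pi / 4.0) / (2.0 * lam ** 0.25)

x = np.linspace(-40.0, 40.0, 8001)
dx = x[1] - x[0]

r1 = continuous_filter(x, 1.0)
r16 = continuous_filter(x, 16.0)

# lam = 16 gives lam**0.25 = 2: the same shape, twice as wide and half as tall.
r16_from_r1 = 0.5 * continuous_filter(x / 2.0, 1.0)

# Both filters have (approximately) unit area, as a smoothing filter should.
area1 = r1.sum() * dx
area16 = r16.sum() * dx
```

Under this closed form, changing λ only stretches the filter by a factor $\lambda^{1/4}$ while preserving its area, which is the sense in which λ plays the role of $\sigma^4$.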

The regularization filter derived here appears to be quite similar to the Gaussian 
distribution. Graphs of the filter R, its first and second derivatives are shown with those 
of the Gaussian in Figure 4. Marr and Hildreth (1980; Hildreth, 1980) have argued that 
the Gaussian is an optimal smoothing filter for detection due to its localization properties 
in both the spatial and frequency domains. The fact that the Gaussian is quite similar to 
the optimal filter derived here using regularization principles provides further mathematical
justification for the use of a Gaussian-like filter in edge detection. 

If the boundary conditions are not periodic or natural, then the derivation of Reinsch 
(see Appendix 1.1) provides the correct Green function for those boundary conditions. 
In numerical experiments we have found that the Green function obtained in this way is 
space-invariant for all points but those close to the boundaries. 

4. Discussion 

Several questions and extensions suggest themselves in a natural way. Here we list some 
of them. 

4.1. Finding the Optimal λ

Regularization theory can give the optimal value of λ if the errors on the smoothing criteria
and the error of the approximations are known in advance: if the integral of $(f'')^2$ is less
than $E$, and the sum of $(y_i - f(x_i))^2$ is less than $\epsilon$, then $\lambda = \epsilon/E$ (see Bertero, 1981;
Tikhonov, 1963; Tikhonov and Arsenin, 1977). Normally, however, errors on the data or
on the smoothness conditions are not known in advance. Regularization theory provides
several methods for finding the optimal smoothing parameter λ under this circumstance.
We want to indicate here two main methods: (1) Tikhonov’s method, for convolution type 
problems, as is the case here, and (2) the cross-validation method and the generalized 
cross-validation method (Wahba, 1980). 

We plan to evaluate these methods for finding the optimal λ for edge detection and
to test them on real images. An interesting issue that we are also planning to explore is
the following: The basic idea of the generalized cross-validation method is to check the
goodness of approximation for each value of λ. In order to do that, one computes the
approximation by using not all but only some of the data points. Thus, the data points that
are not used for computing the approximations serve as the control points for the goodness
of the approximation. If one computes the goodness of the fit at different λ, one can then
choose the optimal λ. This idea has obvious connections with the use of fingerprints (Yuille
and Poggio, 1983) for finding the natural scale of the filter (Witkin, 1983); this point is 
discussed next. 
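A minimal sketch of how such a selection might look is given below. It is an illustration only, under several assumptions that are not taken from the text: the smoother is the discrete periodic operator $H(\lambda) = (I + \lambda D^T D)^{-1}$ with $D$ a second-difference matrix, the score is the standard generalized cross-validation function $n \|(I - H)y\|^2 / (\mathrm{tr}(I - H))^2$, and λ is chosen from a fixed grid of candidates. All function names and the test signal are hypothetical.

```python
import numpy as np

def smoother_matrix(n, lam):
    """Periodic discrete smoother H = (I + lam * D^T D)^{-1}."""
    I = np.eye(n)
    D = np.roll(I, 1, axis=1) - 2.0 * I + np.roll(I, -1, axis=1)
    return np.linalg.inv(I + lam * D.T @ D)

def gcv_score(y, lam):
    """Generalized cross-validation score for the smoother at this lambda."""
    n = y.size
    H = smoother_matrix(n, lam)
    resid = y - H @ y
    return n * float(resid @ resid) / (n - np.trace(H)) ** 2

def choose_lambda(y, candidates):
    """Return the candidate lambda with the smallest GCV score, plus all scores."""
    scores = np.array([gcv_score(y, lam) for lam in candidates])
    return candidates[int(np.argmin(scores))], scores

# Hypothetical periodic signal with additive noise.
n = 64
x = 2.0 * np.pi * np.arange(n) / n
rng = np.random.default_rng(2)
y = np.sin(x) + 0.2 * rng.standard_normal(n)

candidates = np.logspace(-2, 4, 13)
lam_best, scores = choose_lambda(y, candidates)
```

The score avoids refitting with points left out explicitly: the trace term accounts for the influence each data point has on its own fitted value, so a single fit per λ is enough.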

4.2. Optimal λ and natural scale

The size of the filter with which to perform edge detection has always been an unresolved
issue in computer vision. Our approach makes it clear that one expects, indeed, an optimal
size of the filter associated with the optimal value of the smoothing parameter λ. In more
recent years, several scales of filtering have been used, partly as a way around this problem. 
Rosenfeld and Kak (1982), Marr and Poggio (1977), Marr and Hildreth (1980; Hildreth, 1980) 
have used several sizes of filters in order to perform edge detection. 

More recently, Witkin (1983) has suggested the use of scale-space filtering, essentially 
filtering across a continuum of scales, as the method by which to choose the optimal scale. 
Witkin suggested some heuristics for picking the natural scale of filtering. We believe that 
cross-validation-type methods may make more rigorous the idea of selecting an optimal 
filtering scale by selecting the optimal λ value. We are planning to use fingerprints (the
zero-crossings across scales of the Laplacian of the convolved image) to find the optimal
λ.

For scale-space filtering to be maximally effective, the shape of the filter should be 
a gaussian (Yuille and Poggio, 1983a, 1983b; Witkin, 1983). The effect of changing λ is
essentially equivalent to changing the size of the filter, and furthermore, the underlying filter
is very similar to a gaussian. Therefore, increasing the value of λ is equivalent to filtering
the signal with a larger and larger gaussian as required in scale-space filtering. Because
of the many nice properties of gaussian convolution, this also suggests an efficient way of
obtaining approximations at increasing λ.

Figure 5 Difference of two-dimensional Gaussians as approximations to (a) Laplacian of
gaussian, (b) Laplacian of two-dimensional regularizing filter $R_2$. The ratio between the
scales (σ) of the two Gaussians is γ.

4.3. Relation with other edge detectors 

Because of the close similarity of our cubic spline filter to a gaussian, the edge detector 
that we derive in this paper is very similar to edge detectors proposed previously. Marr 
and Poggio (1977) proposed the difference of two gaussians as an approximation to the 
second derivative of a gaussian. Marr and Hildreth (1980; Hildreth, 1980) have shown that 
the second derivative of a gaussian is, indeed, very close to the difference of gaussians. 
J. Canny’s filter (Canny, 1983) is very close to the derivative of a gaussian, and Haralick’s
cubic polynomial interpolant (Haralick, 1982) is again similar to Canny’s filter. 

Our derivation justifies the use of a gaussian or a filter very close to a gaussian as 
the best filter for edge detection. Regularization theory yields derivative-of-gaussian filters 
as the optimal filter in a simpler, more general, and, we believe, more rigorous way, than 
previous derivations. In particular, our result makes clear that the quasi-gaussian filter 
regularizes the ill-posed problem of numerical differentiation. The regularizing constraint 
here is that the norm of the derivatives in the noise-free image is small. 

It is interesting that we derive a filter very similar to Canny’s, based on simpler and 
more general principles that are not restricted to the optimal detection of step edges. It is 
also interesting to note that the second derivative of the regularization filter, like the second 
derivative of the Gaussian, can be approximated by a difference of Gaussians (although 
not as well). While the second derivative of the Gaussian is best approximated by a space
constant ratio (the ratio of scales of the two Gaussians) γ ≈ 1.6 (Marr and Hildreth 1980;
Hildreth, 1980), increasing the ratio to γ ≈ 4 results in a function which better fits the main
(excitatory) lobe of the regularizing filter, as shown in Figure 5.

4.4. Extension to two dimensions 

Our approach can be extended to two dimensions in several ways. The most straightforward 
method involves the use of directional derivatives. First order directional derivatives are 
taken along several directions. Each one of them is one-dimensional and can be performed 
according to our one-dimensional edge detector scheme. Depending on the specific goal, 
one may then choose the direction that gives the maximum value of the derivative. This 
corresponds to the non-maximum suppression scheme used by Canny (1983). It is also
equivalent to taking the second directional derivative along the gradient and looking for its 
zeros (Torre and Poggio, 1984). 
Figure 6 Cross section of (a) two-dimensional regularization filter $R_2$ and (b) its Laplacian.


A second method requires directly formulating the regularization principle in two
dimensions. Instead of equation (2), one would then have the problem of minimizing

$\sum_{i,j} (y_{i,j} - f(x_{i,j}))^2 + \lambda \iint (Pf)^2\,dx\,dy$. (5)

The main problem is the choice of the operator $P$. If we consider a Tikhonov stabilizer (see
Poggio and Torre, 1984), then a choice for $P$ that is smooth enough to allow the use of
second derivatives of the regularized image is $P = \nabla^2 \nabla$. This choice is used in Appendix
3 to derive the filter for the two-dimensional continuous case. The filter is shown in Figure
6. The choice of the derivative to be used on the filter is a separate, important issue that
we do not address in this paper. Torre and Poggio (1984) discuss the properties of several 
two-dimensional differential operators, including the second directional derivative along the 
gradient. 

If P is chosen to be the quadratic variation or the square Laplacian, the resulting 
approximations, known as thin plate splines (Wahba, 1980; Terzopoulos, 1984a), are
not smooth enough for finding zeros of second derivatives of a function $z$, as implied by
Terzopoulos (1984b). It may also be interesting to explore the filters resulting from a 
non-linear functional P. In the case of a non-linear P , one cannot use, in general, the 
standard results about uniqueness and other properties of the solution that are available for 
the quadratic case of Tikhonov’s stabilizers, because the functional is no longer convex. 

Clearly, formulations of this type are also relevant for the problem of surface interpolation
and approximation in the sense of Grimson (1982) and Terzopoulos (1984a). In the
case of sparse data, which they considered, the variational principle does not lead to a 
convolution filter, although it does lead to a standard Green function. On a regular grid it 
leads to a convolution filter similar to the gaussian. As a practical implication, evenly-spaced 
surface data (for example, laser range data) may be interpolated or approximated effectively 
by gaussian convolution. Hence, tasks which involve differentiating surface data, such 
as computing lines of curvature (Brady, Ponce, Yuille and Asada, 1984), could use the 
simpler convolution method to smooth the data. Since Reinsch’s method (see Appendix 1.1)
can deal with boundary conditions different from periodic ones, the corresponding Green 
function can be used to prevent smoothing across depth discontinuities. 

The results of applying the Laplacian of the two-dimensional regularization filter and 
the Laplacian of a gaussian to an image are shown in Figure 7. As expected, due to the 
similarity of the two filters, both edge detection operators yield similar results.


We are indebted to Demetri Terzopoulos for this remark.

Poggio, Voorhees, Yuille

Figure 7 Comparison of Laplacian of two-dimensional regularization filter and Laplacian of
Gaussian as Edge Detectors: (a) image $I$, (b) zero-crossings of $\nabla^2 R_2 * I$, (c) zero-crossings
of $\nabla^2 G * I$.
Appendix 1: The regularization filter for discrete data in one dimension

Below we present two methods for proving that the third regularization method described 
in the text yields a convolution filter in the case of evenly-spaced discrete data in one 
dimension with appropriate boundary conditions. Each derivation includes a method for 
computing the filter, which has the form of a cubic spline. 

1.1. The convolution version of Reinsch’s optimal filter 

Reinsch (1967) considered the general problem of interpolating and smoothing a sequence
of (not necessarily evenly-spaced) data points with uncertainties. Given a sequence of
points $(y_i)$ with uncertainties $(\delta y_i)$, representing values of a function at points $(x_i)$, where
$i = 1, 2, \ldots, n$ and $x_1 < x_2 < \cdots < x_n$, the problem of finding a smooth function which passes
near each point $(x_i, y_i)$ can be formulated as finding the function $f(x)$ that minimizes the
functional (to be compared with equation 4 in the text)

$\sum_i \left( \frac{f(x_i) - y_i}{\delta y_i} \right)^2 + \lambda \int (f''(x))^2\,dx$. (1)

Using calculus of variations, a cubic spline function is shown to be the optimal function
satisfying (1),

$f(x) = a_i + b_i (x - x_i) + c_i (x - x_i)^2 + d_i (x - x_i)^3$, $x_i \le x < x_{i+1}$, (2)

for $i = 1, 2, \ldots, n-1$, and formulas are presented for calculating from the data $(x_i)$ and $(y_i)$
the coefficients $(a_i)$, $(b_i)$, $(c_i)$, and $(d_i)$.

We specialize Reinsch’s derivation as follows: First, the uncertainty associated with
each data point is assumed to be the same, so that all $\delta y_i = 1$. Equation (1) above thus
reduces to equation (4) in the text. Second, the data are assumed to be uniformly spaced,
so that $x_{i+1} - x_i = h$ for $i = 1, 2, \ldots, n-1$. Finally, unlike Reinsch’s derivation, we assume
periodic boundary conditions, i.e., that the solution $f$ is a periodic function over $\mathbb{R}$, with
$f(x) = f(x \pm knh)$ for integer $k$.

With these specializations we show that each set of coefficients can be calculated by 
multiplying the data points y by a constant coefficient matrix:

a = Ay, b = By, c = Cy, d = Dy, 

where A , B, C, and D are circulant matrices representing the filter and its derivatives. 
Hence, the data can be optimally smoothed (and differentiated) by a convolution operation. 

Using calculus of variations, one obtains from (1) these conditions for the optimal
function $f(x)$:

$f(x_i)_+ - f(x_i)_- = 0$ (3a)
$f'(x_i)_+ - f'(x_i)_- = 0$ (3b)
$f''(x_i)_+ - f''(x_i)_- = 0$ (3c)
$f'''(x_i)_+ - f'''(x_i)_- = -\frac{1}{\lambda} (f(x_i) - y_i)$ (3d)

(using $\delta y_i = 1$) and

$f''''(x) = 0$, $x_i < x < x_{i+1}$, (4)

for $i = 1, \ldots, n$ (with $x_1 = x_{n+1}$), where $f^{(k)}(x_i)_\pm$ denotes the limit of $f^{(k)}(x)$ as $x$
approaches $x_i$ from above ($+$) or from below ($-$).

Equations (3-4) show that $f(x)$ is a cubic spline, having the form of (2); hence,

$f(x_i)_+ = a_i$ (5a)
$f'(x_i)_+ = b_i$ (5b)
$f''(x_i)_+ = 2 c_i$ (5c)
$f'''(x_i)_+ = 6 d_i$ (5d)

and, since all $x_i - x_{i-1} = h$,

$f(x_i)_- = a_{i-1} + b_{i-1} h + c_{i-1} h^2 + d_{i-1} h^3$ (6a)
$f'(x_i)_- = b_{i-1} + 2 c_{i-1} h + 3 d_{i-1} h^2$ (6b)
$f''(x_i)_- = 2 c_{i-1} + 6 d_{i-1} h$ (6c)
$f'''(x_i)_- = 6 d_{i-1}$. (6d)

Now substitution into (3) yields

$a_i - a_{i-1} - b_{i-1} h - c_{i-1} h^2 - d_{i-1} h^3 = 0$ (7a)
$b_i - b_{i-1} - 2 c_{i-1} h - 3 d_{i-1} h^2 = 0$ (7b)
$2 c_i - 2 c_{i-1} - 6 d_{i-1} h = 0$ (7c)
$6 d_i - 6 d_{i-1} = -\frac{1}{\lambda} (f(x_i) - y_i)$. (7d)

Equations (7) can be manipulated to yield

$b_i h = (a_{i+1} - a_i) - c_i h^2 - d_i h^3$ (8a)
$a_{i+1} - 2 a_i + a_{i-1} = \frac{1}{3} (c_{i-1} + 4 c_i + c_{i+1}) h^2$ (8b)
$d_i h^3 = \frac{1}{3} (c_{i+1} - c_i) h^2$ (8c)
$(c_{i-1} - 2 c_i + c_{i+1}) h^2 = \frac{h^3}{2\lambda} (y_i - a_i)$. (8d)


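Using the notation that:

$I$ = the $n \times n$ identity matrix $[\delta_{j,k}]$, where $\delta_{j,k} = 1$ if $j = k$, and $0$ otherwise,

$N$ = the $n \times n$ “next” matrix $[n_{j,k}]$, where $n_{j,k} = 1$ if $j = k - 1 \bmod n$, $j = 1, \ldots, n$, and $0$ otherwise,

$P$ = the $n \times n$ “previous” matrix $= N^T$,

we define the matrices

$T = \frac{1}{3} (P + 4I + N)$, $Q = P - 2I + N$.

For example, if $n = 4$,

$I = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$, $N = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{pmatrix}$, $P = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$,

$T = \begin{pmatrix} 4/3 & 1/3 & 0 & 1/3 \\ 1/3 & 4/3 & 1/3 & 0 \\ 0 & 1/3 & 4/3 & 1/3 \\ 1/3 & 0 & 1/3 & 4/3 \end{pmatrix}$, $Q = \begin{pmatrix} -2 & 1 & 0 & 1 \\ 1 & -2 & 1 & 0 \\ 0 & 1 & -2 & 1 \\ 1 & 0 & 1 & -2 \end{pmatrix}$.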
Defining the vectors $a = (a_i)^T$, $b = (b_i h)^T$, $c = (c_i h^2)^T$, $d = (d_i h^3)^T$, and $y = (y_i)^T$, for
$i = 1, \ldots, n$, equations (8) can be expressed as

$b = (N - I) a - c - d$ (9a)
$Q a = T c$ (9b)
$d = \frac{1}{3} (N - I) c$ (9c)
$Q c = \frac{h^3}{2\lambda} (y - a)$. (9d)

These can be simplified to

$a = A y$, $b = B y$, $c = C y$, $d = D y$, (10)

where

$C = \left( \frac{2\lambda}{h^3} Q^2 + T \right)^{-1} Q$ (11a)
$A = I - \frac{2\lambda}{h^3} Q C$ (11b)
$D = \frac{1}{3} (N - I) C$ (11c)
$B = (N - I) A - C - D$. (11d)


Since /, N, and P are circulant matrices, and since the set of circulant matrices is 
closed under matrix addition, multiplication, and inversion, the resulting coefficient matrices 
A, B, C, and D are also circulant, representing filters $R(x)$, $R'(x)$, $R''(x)$, and $R'''(x)$. The fact
that the filters are derivatives of each other follows from (5); hence, since differentiation is a
linear operator, the filter R", for example, can be used to both optimally smooth and twice 
differentiate the data. 
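The construction can be sketched numerically as follows. This is a minimal illustration, assuming the definitions $Q = P - 2I + N$ and $T = \frac{1}{3}(P + 4I + N)$ and the relations $C = (\frac{2\lambda}{h^3} Q^2 + T)^{-1} Q$, $A = I - \frac{2\lambda}{h^3} Q C$, $D = \frac{1}{3}(N - I) C$, $B = (N - I) A - C - D$ for the scaled coefficient vectors; the function name `reinsch_circulant_filters` and the test data are hypothetical.

```python
import numpy as np

def reinsch_circulant_filters(n, lam, h=1.0):
    """Circulant matrices mapping periodic data y to the scaled spline coefficients."""
    I = np.eye(n)
    N = np.roll(I, 1, axis=1)          # "next" matrix: (N v)[i] = v[i + 1], periodic
    P = N.T                            # "previous" matrix
    Q = P - 2.0 * I + N
    T = (P + 4.0 * I + N) / 3.0
    k = 2.0 * lam / h ** 3
    C = np.linalg.solve(k * Q @ Q + T, Q)   # c_i h^2 coefficients
    A = I - k * Q @ C                       # a_i: smoothed values at the samples
    D = (N - I) @ C / 3.0                   # d_i h^3 coefficients
    B = (N - I) @ A - C - D                 # b_i h coefficients
    return A, B, C, D

# Hypothetical periodic data: a noisy sinusoid on 16 samples.
n = 16
rng = np.random.default_rng(3)
y = np.sin(2.0 * np.pi * np.arange(n) / n) + 0.1 * rng.standard_normal(n)

A, B, C, D = reinsch_circulant_filters(n, lam=2.0)
smoothed = A @ y              # spline values f(x_i)
second_deriv = 2.0 * (C @ y)  # f''(x_i) = 2 c_i when h = 1
filter_taps = A[0]            # one row of A holds the discrete smoothing filter
```

Each row of `A` is a cyclic shift of `filter_taps`, so applying `A` is a circular convolution of the data with a single symmetric filter whose taps sum to one.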


1.2. Another derivation of the regularization filter for discrete data in one dimension

In this derivation we show that the third regularization method, described by equation (4) 
of the text, yields a convolution filter which is a cubic spline. Standard results from the 
calculus of variations guarantee that our solution has continuous second derivatives. 

Again, the problem is to minimize

$\lambda \int (f''(x))^2\,dx + \sum_i (f(x_i) - y_i)^2$. (1)

We find the minimum by sending $f(x) \mapsto f(x) + \delta f(x)$ and setting the first variation of (1) to
zero,

$\lambda \int f''''(x)\,\delta f(x)\,dx + \sum_i (f(x_i) - y_i)\,\delta f(x_i) = 0$. (2)

This yields the Euler-Lagrange equation

$\lambda f''''(x) + f(x) \sum_i \delta(x - x_i) = \sum_i y_i\,\delta(x - x_i)$. (3)

So far, we have deliberately not specified boundary conditions. For infinite, or toroidal,
boundary conditions, the function $f(x)$ can be determined in terms of the $(y_i)$ by a convolution
filter if and only if the data points $(x_i)$ are evenly spaced. This is because the system is
then translation-invariant (see footnote 2). We will show this explicitly and give a method for
constructing the convolution filter.

The functional in (1) is convex, and hence has a unique minimum, so there is a unique
solution for $f(x)$ in (3). Thus, we only need to see whether a convolution can solve (3). We


2 Boundary conditions other than infinite or toroidal will destroy translation invariance. 
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try a solution 


f{x) = R(x) * y{x ) = J R(x- (4) 

where * denotes convolution, R(x) is a filter, and y(x) — ][Ty,- 6 (x — xA. We substitute ( 4 ) 
into (3) and obtain 

x yi R "'\ x - x i) + 5Z !C yj R i x i - Xj)s(x - Xi) - y<H x - *0 = o 

We compare coefficients of y t and obtain 

\R""(x - Xi) + ^2 R i x j ~ Xi)6(x - Xj) - S(x - x^ = 0 (6) 

3 

If the (x^ are not evenly-spaced, then these equations are inconsistent, and no convolution 
filter exists. If the (z<) are evenly spaced, then the set of equations in ( 6 ) reduces to a single 
equation 

\R""(x) + R( Xj )S(x - Xj) - 6{x) — 0 (7) 

3 

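The solutions to (7) correspond to cubic splines “stitched” together at the points $\{x_i\}$.
Let $R_i(x)$ denote the solution in the range $x_i < x < x_{i+1}$. We write

$$R_i(x) = \alpha_i x^3 + \beta_i x^2 + \gamma_i x + \delta_i \tag{8}$$

The splines are stitched together so that $R(x)$, $R'(x)$, and $R''(x)$ are continuous at the points
$\{x_i\}$. From (7), we see that $R'''$ has a discontinuity of $-\frac{1}{\lambda} R(x_i)$ at each $x_i \neq 0$. It is
straightforward to find the relations between $R_i(x)$ and $R_{i-1}(x)$ in terms of the parameters
$\alpha_i$, $\beta_i$, $\gamma_i$ and $\delta_i$: two adjacent pieces can differ only by a multiple of $(x - nh)^3$.
This gives

$$\begin{aligned}
\alpha_{n+1} &= \alpha_n - \frac{1}{6\lambda} R(nh) \\
\beta_{n+1} &= \beta_n + \frac{nh}{2\lambda} R(nh) \\
\gamma_{n+1} &= \gamma_n - \frac{(nh)^2}{2\lambda} R(nh) \\
\delta_{n+1} &= \delta_n + \frac{(nh)^3}{6\lambda} R(nh)
\end{aligned} \tag{9}$$

where $h$ is the spacing between the lattice points, $h = x_{i+1} - x_i$, and

$$R(nh) = \alpha_n (nh)^3 + \beta_n (nh)^2 + \gamma_n (nh) + \delta_n. \tag{10}$$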
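The translation-invariance argument above can be illustrated numerically. The following is only a sketch under stated assumptions: it replaces the continuous problem by a periodic fine grid with a second-difference stabilizer (a discrete analogue of equation (3), not the spline construction itself), and the names `impulse_responses` and `is_shift_invariant` are hypothetical helpers introduced here.

```python
import numpy as np

def impulse_responses(n_grid, sample_idx, lam):
    """Columns are the fitted f (on a periodic fine grid) for unit data at each sample."""
    eye = np.eye(n_grid)
    # periodic second-difference operator with unit grid spacing
    d2 = np.roll(eye, -1, axis=1) - 2.0 * eye + np.roll(eye, 1, axis=1)
    pick = eye[sample_idx]                      # rows select the sampled grid points
    normal = pick.T @ pick + lam * d2.T @ d2    # discrete analogue of equation (3)
    return np.linalg.solve(normal, pick.T)

def is_shift_invariant(resp, sample_idx):
    """Check that each response is the first one translated to its own sample."""
    base = np.roll(resp[:, 0], -sample_idx[0])
    return all(np.allclose(np.roll(resp[:, k], -i), base, atol=1e-8)
               for k, i in enumerate(sample_idx))

even = np.arange(0, 64, 8)                              # evenly spaced samples
uneven = np.array([0, 5, 16, 22, 33, 41, 48, 59])       # unevenly spaced samples
even_ok = is_shift_invariant(impulse_responses(64, even, 10.0), even)
uneven_ok = is_shift_invariant(impulse_responses(64, uneven, 10.0), uneven)
print(even_ok, uneven_ok)
```

For the evenly spaced samples every impulse response is a translated copy of a single kernel, so the fit is a convolution; for the uneven samples the responses differ from one data point to the next.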
Appendix 2: The regularization filter for continuous data in one dimension

We want to find the function $f(x)$ which minimizes the quadratic functional

$$\int (f(x) - y(x))^2\, dx + \lambda \int (f''(x))^2\, dx, \tag{1}$$

where $y(x)$ are the data, given on the continuum. We will show that $f(x)$ can be expressed
as a convolution of the data $y(x)$ with a suitable filter $R(x)$.

We obtain the Euler-Lagrange equations for (1) by setting the first variation to zero as
$f(x) \to f(x) + \delta f(x)$. This gives

$$\int \delta f(x)\, (f(x) - y(x))\, dx + \lambda \int \delta f(x)\, f''''(x)\, dx = 0 \tag{2}$$

for all variations $\delta f(x)$ and hence we have

$$\lambda f''''(x) + f(x) = y(x). \tag{3}$$

The solution to the linear equation (3) is given by

$$f(x) = \int r(x, x')\, y(x')\, dx' \tag{4}$$

where $r(x, x')$ is the Green function of (3) obeying

$$\lambda \frac{d^4}{dx^4} r(x, x') + r(x, x') = \delta(x - x'). \tag{5}$$

Now (5) is a translation invariant equation and our boundary conditions are periodic or at
infinity, so $r(x, x')$ is a function of $(x - x')$ only. We write

$$r(x, x') = r(x - x'). \tag{6}$$

For each different value of $\lambda$, we have a different equation (5) and hence a different
Green’s function, which we denote by $R(x, \lambda)$. Note from (5) that

$$R(x, 0) = \delta(x). \tag{7}$$

So in the limit where the confidence in the data is complete ($\lambda \to 0$), the filter becomes the
delta function. (It is easy to see all this by taking the Fourier transform of equation (3).)

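It is straightforward to find the Green’s function of (5) which vanishes at plus and
minus infinity. This is given by

$$R(x, \lambda) = \frac{1}{2\lambda^{1/4}}\, e^{-|x|/(\sqrt{2}\,\lambda^{1/4})} \cos\!\left(\frac{|x|}{\sqrt{2}\,\lambda^{1/4}} - \frac{\pi}{4}\right). \tag{8}$$

In Fourier space, the transform of $R(x, \lambda)$ can be obtained directly from (5):

$$\mathcal{F}R(\omega, \lambda) = \frac{1}{1 + \lambda\omega^4}, \tag{9}$$

so we can write

$$R(x, \lambda) = \frac{1}{2\pi} \int \frac{1}{1 + \lambda\omega^4}\, e^{-i\omega x}\, d\omega. \tag{10}$$

At $\lambda = 0$ the Green function goes to a delta function, as we saw in (7).
Define $\mu$ by

$$\lambda\mu^4 = 1. \tag{11}$$

Then, for $x > 0$, we have

$$R(x, \mu) = \frac{\mu}{2\sqrt{2}}\, e^{-\mu x/\sqrt{2}} \left(\cos\frac{\mu x}{\sqrt{2}} + \sin\frac{\mu x}{\sqrt{2}}\right) \tag{12}$$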
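As a sanity check on the closed form, the filter can be compared with a direct numerical inversion of $1/(1 + \lambda\omega^4)$. This is a minimal sketch: the function names, the frequency cutoff `w_max`, and the grid size are assumptions chosen for illustration.

```python
import numpy as np

def reg_filter(x, lam):
    """Closed form: (mu / (2 sqrt 2)) exp(-a) (cos a + sin a), with a = mu |x| / sqrt 2."""
    mu = lam ** -0.25
    a = mu * np.abs(x) / np.sqrt(2.0)
    return mu / (2.0 * np.sqrt(2.0)) * np.exp(-a) * (np.cos(a) + np.sin(a))

def reg_filter_fourier(x, lam, w_max=200.0, n=400001):
    """Invert 1 / (1 + lam w^4) numerically; the integrand is even in w."""
    w = np.linspace(0.0, w_max, n)
    integrand = np.cos(w * x) / (1.0 + lam * w ** 4)
    dw = w[1] - w[0]
    # trapezoid rule on [0, w_max]; dividing by pi accounts for the negative frequencies
    area = dw * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return area / np.pi

lam = 2.0
xs = [0.0, 0.5, 1.0, 3.0]
closed = [reg_filter(x, lam) for x in xs]
numeric = [reg_filter_fourier(x, lam) for x in xs]
print(closed)
print(numeric)
```

The two lists should agree to within the truncation error of the frequency integral.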
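and

$$\frac{dR}{dx} = -\frac{\mu^2}{2}\, e^{-\mu x/\sqrt{2}} \sin\frac{\mu x}{\sqrt{2}}. \tag{13}$$

Thus the extrema of $R(x, \mu)$ occur at

$$x = \frac{\sqrt{2}}{\mu}\, n\pi, \quad n = 1, 2, \ldots \tag{14}$$

and at these extrema, $R$ takes the value

$$R\!\left(\frac{\sqrt{2}\, n\pi}{\mu}\right) = \frac{\mu}{2\sqrt{2}}\, e^{-n\pi} (-1)^n. \tag{15}$$

So if $\mu$ is small ($\lambda$ large), the extrema are at large $x$ and correspond to small values of $R$. If
$\mu$ is large ($\lambda$ small), the nearest extrema occur at small $x$ and correspond to large values of
$R$. The function changes sign many times, but by (15) each successive extremum is smaller
than the previous one by a factor $e^{-\pi} \approx 0.043$.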
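The location and size of the extrema can be checked directly from the closed form. The sketch below is illustrative only; `reg_filter_mu`, `extremum`, and `slope_at` are hypothetical helper names, and the slope is estimated by a central difference rather than from equation (13).

```python
import math

def reg_filter_mu(x, mu):
    """R(x, mu) for x >= 0."""
    a = mu * x / math.sqrt(2.0)
    return mu / (2.0 * math.sqrt(2.0)) * math.exp(-a) * (math.cos(a) + math.sin(a))

def extremum(n, mu):
    """Predicted location and value of the n-th extremum."""
    x_n = math.sqrt(2.0) * n * math.pi / mu
    value = mu / (2.0 * math.sqrt(2.0)) * math.exp(-n * math.pi) * (-1) ** n
    return x_n, value

def slope_at(x, mu, h=1e-6):
    """Central-difference estimate of dR/dx."""
    return (reg_filter_mu(x + h, mu) - reg_filter_mu(x - h, mu)) / (2.0 * h)

mu = 1.5
for n in (1, 2, 3):
    x_n, predicted = extremum(n, mu)
    print(n, x_n, predicted, reg_filter_mu(x_n, mu), slope_at(x_n, mu))
```

At each predicted location the estimated slope should be close to zero and the filter value should match the predicted one, with alternating sign.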
Appendix 3: The regularization filter for continuous data in two dimensions

We can extend the results of the previous section to two dimensions. Our generalization of
the smoothing functional $\int (f''(x))^2\, dx$ is

$$\int\!\!\int \nabla^2 \nabla f(\mathbf{x}) \cdot \nabla^2 \nabla f(\mathbf{x})\, d\mathbf{x}. \tag{1}$$

Courant and Hilbert (1953) show that the Euler-Lagrange equation is

$$\nabla^2 \nabla^2 \nabla^2 f = 0. \tag{2}$$

We write the regularization of the two-dimensional smoothing problem in the form

$$\int\!\!\int (f(\mathbf{x}) - y(\mathbf{x}))^2\, d\mathbf{x} + \lambda \int\!\!\int \nabla^2 \nabla f(\mathbf{x}) \cdot \nabla^2 \nabla f(\mathbf{x})\, d\mathbf{x}. \tag{3}$$

The Euler-Lagrange equations of the combined system are (the three integrations by parts
contribute a factor $(-1)^3$ to the stabilizer term)

$$-\lambda \nabla^2 \nabla^2 \nabla^2 f(\mathbf{x}) + f(\mathbf{x}) = y(\mathbf{x}). \tag{4}$$

This equation is translation invariant and so, for boundary conditions at infinity or periodic,
the solution can be written as a convolution

$$f(\mathbf{x}) = R_2(\mathbf{x}) * y(\mathbf{x}) \tag{5}$$

where

$$-\lambda \nabla^2 \nabla^2 \nabla^2 R_2(\mathbf{x}) + R_2(\mathbf{x}) = \delta(\mathbf{x}) \tag{6}$$

where $\delta(\mathbf{x})$ is the Dirac delta function. Again, observe that for small $\lambda$ the filter $R_2(\mathbf{x})$ tends
to the delta function.

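To solve (6) we take its Fourier transform, and find

$$\mathcal{F}R_2(\boldsymbol{\omega}) = \frac{1}{\lambda |\boldsymbol{\omega}|^6 + 1}. \tag{7}$$

This gives a solution

$$R_2(\mathbf{x}) = \frac{1}{(2\pi)^2} \int \frac{e^{+i\boldsymbol{\omega}\cdot\mathbf{x}}}{\lambda |\boldsymbol{\omega}|^6 + 1}\, d\boldsymbol{\omega}. \tag{8}$$

We express $\mathbf{x}$ and $\boldsymbol{\omega}$ in polar coordinates

$$\mathbf{x} = (r, \phi), \qquad \boldsymbol{\omega} = (w, \theta). \tag{9}$$

So

$$R_2 = \frac{1}{(2\pi)^2} \int_0^\infty\!\!\int_0^{2\pi} \frac{e^{+i w r \cos(\theta - \phi)}}{\lambda w^6 + 1}\, w\, dw\, d\theta. \tag{10}$$

Now

$$\int_0^{2\pi} e^{i w r \cos(\theta - \phi)}\, d\theta = 2\pi J_0(wr), \tag{11}$$

where $J_0$ is the zero order Bessel function. This gives

$$R_2 = \frac{1}{2\pi} \int_0^\infty \frac{J_0(wr)}{\lambda w^6 + 1}\, w\, dw, \tag{12}$$

that is, $R_2$ is (up to the factor $1/2\pi$) the Hankel transform of $1/(\lambda w^6 + 1)$. This integral can
be solved numerically. The square Laplacian stabilizer leads to a similar formula, i.e. to

$$R_2 = \frac{1}{2\pi} \int_0^\infty \frac{J_0(wr)}{\lambda w^4 + 1}\, w\, dw. \tag{13}$$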
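The Hankel-transform integral can be evaluated numerically along the following lines. This is a rough sketch, not a production quadrature: it assumes the $1/2\pi$ normalization, computes $J_0$ from its integral representation so that only NumPy is needed, and uses a plain trapezoid rule with a frequency cutoff (`w_max`, `n_w`, and `n_theta` are illustrative choices that limit the usable range of $r$ to a few units).

```python
import numpy as np

def bessel_j0(z, n_theta=400):
    """J0(z) from (1/pi) * integral_0^pi cos(z sin t) dt, by the midpoint rule."""
    t = (np.arange(n_theta) + 0.5) * np.pi / n_theta
    return np.cos(np.multiply.outer(z, np.sin(t))).mean(axis=-1)

def reg_filter_2d(r, lam, order=6, w_max=30.0, n_w=6001):
    """(1 / 2pi) * integral_0^inf J0(w r) w / (lam w^order + 1) dw, truncated at w_max."""
    w = np.linspace(0.0, w_max, n_w)
    integrand = bessel_j0(w * r) * w / (lam * w ** order + 1.0)
    dw = w[1] - w[0]
    area = dw * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return area / (2.0 * np.pi)

lam = 1.0
profile = [(r, reg_filter_2d(r, lam)) for r in (0.0, 0.5, 1.0, 2.0, 4.0)]
for r, value in profile:
    print(r, value)
```

At $r = 0$ the integral is elementary and equals $\lambda^{-1/3}/(6\sqrt{3})$, which gives a convenient check on the quadrature. Passing `order=4` evaluates the square Laplacian variant, although the slower decay of that integrand makes the cutoff error larger.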


The corresponding filter is not smooth enough to allow second derivatives to be taken, but
is sufficient for first derivatives. The integral is found in standard texts (for example,
Gradshteyn and Ryzhik, 1980), yielding

$$R_2(r) = -\frac{1}{2\pi\sqrt{\lambda}}\, \mathrm{kei}\!\left(\frac{r}{\lambda^{1/4}}\right), \tag{14}$$

where $\mathrm{kei}(x)$ is a Thomson function given by

$$\mathrm{kei}(x) = \left(\ln\frac{2}{x} - C\right) \mathrm{bei}(x) - \frac{\pi}{4}\, \mathrm{ber}(x) + \sum_{k=0}^{\infty} (-1)^k \frac{x^{4k+2}}{2^{4k+2}\, [(2k+1)!]^2} \sum_{m=1}^{2k+1} \frac{1}{m}, \tag{15}$$

where

$$\mathrm{bei}(x) = \sum_{k=0}^{\infty} \frac{(-1)^k x^{4k+2}}{2^{4k+2}\, [(2k+1)!]^2}, \qquad \mathrm{ber}(x) = \sum_{k=0}^{\infty} \frac{(-1)^k x^{4k}}{2^{4k}\, [(2k)!]^2}, \tag{16}$$

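and $C = 0.5772\ldots$ is Euler's constant. The asymptotic behavior is

$$\mathrm{kei}(x) \approx -\sqrt{\frac{\pi}{2x}}\; e^{-\alpha x} \sin\!\left(\beta x + \frac{\pi}{8}\right), \tag{17}$$

where $\alpha$ and $\beta$ are constants ($\alpha = \beta = 1/\sqrt{2}$ in the standard large-$x$ expansion).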
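The power series for the Thomson (Kelvin) functions is straightforward to evaluate for moderate arguments. The sketch below is illustrative: `kelvin_series` and `squared_laplacian_filter` are hypothetical names, the truncation at 30 terms is an arbitrary choice, and the closed form $R_2(r) = -\mathrm{kei}(r\lambda^{-1/4})/(2\pi\sqrt{\lambda})$ is assumed here (it relies on the $1/2\pi$ normalization of the Hankel integral).

```python
import math

EULER_C = 0.5772156649015329

def kelvin_series(x, terms=30):
    """Return (ber, bei, kei) at x > 0 from their power series."""
    ber = bei = tail = 0.0
    for k in range(terms):
        ber += (-1) ** k * (x / 2.0) ** (4 * k) / math.factorial(2 * k) ** 2
        bei_term = (-1) ** k * (x / 2.0) ** (4 * k + 2) / math.factorial(2 * k + 1) ** 2
        bei += bei_term
        harmonic = sum(1.0 / m for m in range(1, 2 * k + 2))   # 1 + 1/2 + ... + 1/(2k+1)
        tail += bei_term * harmonic
    kei = (math.log(2.0 / x) - EULER_C) * bei - math.pi / 4.0 * ber + tail
    return ber, bei, kei

def squared_laplacian_filter(r, lam):
    """Assumed closed form of the square Laplacian filter, valid for r > 0."""
    return -kelvin_series(r / lam ** 0.25)[2] / (2.0 * math.pi * math.sqrt(lam))

print(kelvin_series(1.0))
print(squared_laplacian_filter(1e-8, 4.0))
```

Because the terms alternate in sign and grow before they decay, the series loses accuracy for large arguments; there the asymptotic form is the better choice.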
Appendix 4: The regularization filter as an approximation to the Gaussian

In this appendix we show that the regularization filter approximates the gaussian, both in 
one and two dimensions. 


4.1 Comparison of one-dimensional filters 


We show that the one-dimensional regularization filter is an approximate solution to the 
diffusion equation and therefore approximates a gaussian. 


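The one-dimensional regularization filter is given, for $x > 0$, by

$$R(x, \mu) = \frac{\mu}{2\sqrt{2}}\, e^{-\mu x/\sqrt{2}} \left(\cos\frac{\mu x}{\sqrt{2}} + \sin\frac{\mu x}{\sqrt{2}}\right). \tag{1}$$

We expand this in a Taylor series in $\mu x$ to get

$$R(x, \mu) = \frac{\mu}{2\sqrt{2}} \left(1 - \left(\frac{\mu x}{\sqrt{2}}\right)^2 + O\big((\mu x)^3\big)\right). \tag{2}$$

This expansion is valid when $\mu x$ is small, i.e., when $x$ is small compared to $\lambda^{1/4}$. In this case
the first two terms, which we denote by $\bar{R}(x, \mu)$, are a good approximation to the function.
We calculate

$$\frac{\partial^2 \bar{R}}{\partial x^2} = -\frac{\mu^3}{2\sqrt{2}} \tag{3}$$

and

$$\frac{\partial \bar{R}}{\partial \mu} = \frac{1}{2\sqrt{2}} + O\big((\mu x)^2\big), \tag{4}$$

which satisfy, to order $(\mu x)^2$, the equation

$$\frac{\partial^2 \bar{R}}{\partial x^2} = -\mu^3\, \frac{\partial \bar{R}}{\partial \mu}. \tag{5}$$

Thus this function obeys the diffusion equation,

$$\frac{\partial^2 \bar{R}}{\partial x^2} = \frac{\partial \bar{R}}{\partial t}, \tag{6}$$

with parameter $t = \frac{1}{2}\mu^{-2} = \frac{1}{2}\lambda^{1/2}$, in the region where $\mu x$ is small. As $\mu$ decreases, this
region gets larger and the region in which the function approximates a Gaussian increases.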
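The diffusion-equation relation can be probed numerically with finite differences. Note the assumptions in this sketch: it uses the full closed-form filter rather than the two-term approximation, so the mismatch is expected to vanish only as $\mu x \to 0$ and to grow away from the origin; `reg_filter_mu` and `diffusion_residual` are hypothetical names and the step `h` is an arbitrary choice.

```python
import math

def reg_filter_mu(x, mu):
    """Full closed-form filter R(x, mu), extended to negative x by symmetry."""
    a = mu * abs(x) / math.sqrt(2.0)
    return mu / (2.0 * math.sqrt(2.0)) * math.exp(-a) * (math.cos(a) + math.sin(a))

def diffusion_residual(x, mu, h=1e-4):
    """Relative mismatch between d2R/dx2 and dR/dt, where t = 1 / (2 mu^2)."""
    d2_dx2 = (reg_filter_mu(x + h, mu) - 2.0 * reg_filter_mu(x, mu)
              + reg_filter_mu(x - h, mu)) / h ** 2
    d_dmu = (reg_filter_mu(x, mu + h) - reg_filter_mu(x, mu - h)) / (2.0 * h)
    d_dt = -mu ** 3 * d_dmu            # chain rule: dmu/dt = -mu^3
    return abs(d2_dx2 - d_dt) / abs(d_dt)

for x in (0.0, 0.05, 0.2, 1.0):
    print(x, diffusion_residual(x, 1.0))
```

The residual is negligible at the origin and increases with $\mu x$, which is consistent with the statement that the Gaussian approximation holds in the region where $\mu x$ is small.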

This theoretical analysis supports the numerical results (for discrete data) which show
that $R$ can be approximated by a Gaussian. Furthermore, recalling that the standard
deviation $\sigma$ of the Gaussian is given by $\sigma = \sqrt{2t}$, the analysis shows that the standard
deviation of the corresponding Gaussian is $\lambda^{1/4}$.

A comparison of $R$ with the gaussian $G = e^{-x^2/2\sigma^2}$ (up to normalization) can also be done
directly in the Fourier domain, where $\mathcal{F}R = 1/(1 + \lambda\omega^4)$ and $\mathcal{F}G = e^{-\sigma^2\omega^2/2}$, as shown in Figure 8.
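A simple way to reproduce this kind of comparison is to tabulate both transfer functions on a frequency grid. The sketch below assumes a unit-area Gaussian with $\sigma = \lambda^{1/4}$ (so both transforms equal 1 at zero frequency); the function names and the frequency range are illustrative choices.

```python
import numpy as np

def ft_reg_filter(w, lam):
    """Transfer function of the regularization filter."""
    return 1.0 / (1.0 + lam * w ** 4)

def ft_gaussian(w, sigma):
    """Transfer function of a unit-area Gaussian with standard deviation sigma."""
    return np.exp(-0.5 * (sigma * w) ** 2)

lam = 1.0
sigma = lam ** 0.25
w = np.linspace(0.0, 6.0, 601)
gap = np.abs(ft_reg_filter(w, lam) - ft_gaussian(w, sigma))
worst_w, worst_gap = w[gap.argmax()], gap.max()

# Ratio of the spatial peak values R(0) / G(0) for the same pair of filters
peak_ratio = (1.0 / (2.0 * np.sqrt(2.0) * lam ** 0.25)) / (1.0 / (np.sqrt(2.0 * np.pi) * sigma))
print(worst_w, worst_gap, peak_ratio)
```

With this choice of $\sigma$ the peak ratio is $\sqrt{\pi}/2$ independently of $\lambda$, so the two filters differ in height by a fixed proportion while sharing the same scale.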

4.2 Comparison of two-dimensional filters 

We now consider the two-dimensional case. The regularized filter can be written in terms 
of a Fourier integral 


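$$R_2(\mathbf{x}, \lambda) = \frac{1}{(2\pi)^2} \int \frac{e^{-i\boldsymbol{\omega}\cdot\mathbf{x}}}{1 + \lambda |\boldsymbol{\omega}|^6}\, d\boldsymbol{\omega}. \tag{7}$$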
Figure 8 Comparison of one-dimensional regularizing filter R with gaussian in Fourier
domain: (a) zeroth, (b) first, and (c) second derivatives.


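We perform the transformation $\boldsymbol{\omega} \mapsto \lambda^{1/6} \boldsymbol{\omega}$ to obtain, with $\mu = \lambda^{-1/6}$,

$$R_2(\mathbf{x}, \mu) = \frac{\mu^2}{(2\pi)^2} \int \frac{e^{-i\mu\boldsymbol{\omega}\cdot\mathbf{x}}}{1 + |\boldsymbol{\omega}|^6}\, d\boldsymbol{\omega}. \tag{8}$$

We expand the exponential in a power series

$$R_2(\mathbf{x}, \mu) = \frac{\mu^2}{(2\pi)^2} \int \frac{1}{1 + |\boldsymbol{\omega}|^6} \left(1 - i\mu\, \boldsymbol{\omega}\cdot\mathbf{x} - \frac{(\mu\, \boldsymbol{\omega}\cdot\mathbf{x})^2}{2} + O(\mu^3 x^3)\right) d\boldsymbol{\omega}. \tag{9}$$

Thus we have

$$R_2(\mathbf{x}, \mu) = \frac{\mu^2}{(2\pi)^2} \int \frac{1}{1 + |\boldsymbol{\omega}|^6} \left(1 - \frac{(\mu\, \boldsymbol{\omega}\cdot\mathbf{x})^2}{2} + O(\mu^3 x^3)\right) d\boldsymbol{\omega}. \tag{10}$$

Note that the linear term drops out due to the antisymmetry of the integrand. Keeping the first
two terms on the right hand side of (10), and denoting this approximation by $\bar{R}_2(\mathbf{x}, \mu)$, we
calculate

$$\nabla^2 \bar{R}_2 = -\frac{\mu^4}{(2\pi)^2} \int \frac{|\boldsymbol{\omega}|^2}{1 + |\boldsymbol{\omega}|^6}\, d\boldsymbol{\omega} \tag{11}$$

and

$$\frac{\partial \bar{R}_2}{\partial \mu} = \frac{2\mu}{(2\pi)^2} \int \frac{d\boldsymbol{\omega}}{1 + |\boldsymbol{\omega}|^6} + O(\mu^3 x^2). \tag{12}$$

Thus as before, the approximation $\bar{R}_2(\mathbf{x}, \lambda)$ satisfies the diffusion equation, now with $t$
proportional to $\mu^{-2} = \lambda^{1/3}$. The exact constant of proportionality can be calculated from (11)
and (12).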
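The constant of proportionality can be estimated numerically. The sketch below rests on assumptions that should be checked against the derivation: it takes $\nabla^2 \bar{R}_2 = -c\,\mu^4 \int |\boldsymbol{\omega}|^2/(1+|\boldsymbol{\omega}|^6)\, d\boldsymbol{\omega}$ and $\partial \bar{R}_2/\partial\mu = 2c\,\mu \int d\boldsymbol{\omega}/(1+|\boldsymbol{\omega}|^6)$ with a common prefactor $c$, so that requiring $\partial \bar{R}_2/\partial t = \nabla^2 \bar{R}_2$ gives $t = (A_0/A_2)\,\mu^{-2}$, where $A_0$ and $A_2$ are the two plane integrals. The name `radial_integral` is hypothetical.

```python
import numpy as np

def radial_integral(power, n=200000):
    """integral_0^inf w^power / (1 + w^6) dw, using the substitution w = s / (1 - s)."""
    s = (np.arange(n) + 0.5) / n            # midpoints of [0, 1)
    w = s / (1.0 - s)
    return np.sum(w ** power / (1.0 + w ** 6) / (1.0 - s) ** 2) / n

# In polar coordinates d(omega) = w dw dtheta, so each plane integral gains a factor 2 pi w.
area_0 = 2.0 * np.pi * radial_integral(1)   # integral of 1 / (1 + |w|^6) over the plane
area_2 = 2.0 * np.pi * radial_integral(3)   # integral of |w|^2 / (1 + |w|^6) over the plane
t_times_mu_squared = area_0 / area_2
print(area_0, area_2, t_times_mu_squared)
```

Both radial integrals have the closed form $\pi/(3\sqrt{3})$, so under these assumptions the ratio is 1 and the diffusion parameter is simply $t = \mu^{-2}$.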

Again, a direct comparison of the Gaussian with our regularizing filter $R_2$ is done easily
in the Fourier plane. Both filters are circularly symmetric and therefore depend only on
the radial frequency $w$. A comparison in Fourier space of the two-dimensional gaussian
$\mathcal{F}G_2$ and the regularizing filter $\mathcal{F}R_2 = 1/(\lambda w^6 + 1)$ is shown in Figure 9.
Figure 9 Comparison of two-dimensional filter $R_2$ with two-dimensional gaussian $G_2$ in
Fourier plane: (a) filters and (b) their Laplacians.




Poggio 


Voorhees, Yuille 


6. References 

Bertero, M. “Problemi lineari non ben posti e metodi di regolarizzazione,” Problemi non ben
posti ed inversi, Istituto di Analisi Globale, Firenze, 1982.

Brady, J. M. “Computational approaches to image understanding,” Computing Surveys, 14,
3-71, 1982.

Brady, J. M., Ponce, J., Yuille, A. L. and Asada, H. “Describing surfaces,” Proc. 2nd
International Symposium, Robotics Research, Hanafusa, H. and Inoue, H. (eds), MIT
Press, 1985.

Canny, J. F. “Finding edges and lines in images,” AI-TR-720, MIT AI Lab, June 1983.

Courant, R. and Hilbert, D. Methods of Mathematical Physics, vol. 1, Interscience Publishers,
Inc., New York, 1953.

Duda, Richard O., and Hart, Peter E. Pattern Classification and Scene Analysis, John Wiley 
and Sons, New York, 1973. 

Gradshteyn, I.S. and Ryzhik, I.M. Table of Integrals, Series and Products, Academic Press, 
1980. 

Greville, T. N. E. (ed). “Introduction to spline functions,” Theory and Applications of Spline 
Functions, Academic Press, New York, 1-36, 1969. 

Grimson, W. E. L. “A computational theory of visual surface interpolation,” Phil. Trans. R.
Soc. Lond., B, 298, 395-427, 1982.

Grimson, W. E. L. From Images to Surfaces, MIT Press, 1981. 

Grimson, W. E. L. and Hildreth, E. “Comments on ‘Digital step edges from zero crossings 
of second directional derivatives’,” IEEE Trans. PAMI, 7, 121-127, 1985. 

Haralick, R. M. “Edge and region analysis for digital image data,” Comp. Graphics and 
Image Proc., 12, 60-73, 1980. 

Haralick, R. M. “The digital edge,” Proc. 1981 Conf. on Pattern Recognition and Image 
Processing, Dallas, Texas, 285-294, 1981. 

Haralick, R. M. “Zero-crossings of second directional derivative edge operator,” SPIE Proc. 
on Robot Vision, Arlington, Va., 1982. 

Hildreth, E. C. “Implementation of a theory of edge detection,” AI-TR-579, MIT AI Lab, 1980.

Lunscher, W. H. H. “The asymptotic optimal frequency domain filter for edge detection,” 
IEEE Trans. PAMI, 6, 678-680, 1983. 

Marr, D. C. and Hildreth, E. C. “Theory of edge detection,” Proc. R. Soc. Lond. B, 207,
187-217, 1980.

Marr, D. and Poggio, T. “A computational theory of human stereo vision,” Proc. R. Soc.
Lond. B, 204, 301-328, 1979 (also AI Memo 451, MIT AI Lab, 1977).

Poggio, T. and Torre, V. “Ill-posed problems and regularization analysis in early vision,” AI
Memo 773, MIT AI Lab, 1984.

Reinsch, C. H. “Smoothing by spline functions,” Numer. Math., 10, 177-183, 1967.

Rosenfeld, A. and Kak, A. C. Digital Picture Processing, 2nd ed., Academic Press, New York, 
1982. 

Schoenberg, I. J. “Contributions to the problem of approximation of equidistant data by 
analytic functions,” Quart. Appl. Math., 4, 45-99, 112-141, 1946. 

Schoenberg, I. J. “Spline functions and the problem of graduation,” Proc. Nat. Acad. Sci. 
USA, 52, 947-950, 1964. 

Shanmugam, K. F., Dickey, F. M. and Green, J. A. “An optimal frequency domain filter for 
edge detection in digital pictures,” IEEE Trans. PAMI, 1, 37-49, 1979. 




Terzopoulos, D. “Multiresolution computation of visible-surface representations,” Ph.D.
Thesis, Dept. of EECS, MIT, 1984a.

Terzopoulos, D. “Controlled Smoothness Stabilizers for Ill-Posed Visual Problems Involving
Discontinuities,” Proc. Image Understanding Workshop, SAIC, 1984b.

Tikhonov, A. N. “Solution of incorrectly formulated problems and the regularization method,” 
Soviet Math. Dokl., 4, 1035-1038, 1963. 

Tikhonov, A. N. and Arsenin, V. Y. Solutions of ill-posed problems, Winston and Sons,
Washington, D.C., 1977.

Torre, V. and Poggio, T. “On edge detection,” AI Memo 768, MIT AI Lab, 1984.

Wahba, G. “Ill-posed problems: numerical and statistical methods for mildly, moderately, and 
severely ill-posed problems with noisy data,” Tech. Report 595, Univ. of Wisconsin, 
Madison, 1980. 

Witkin, A. “Scale-Space Filtering,” Proc. IJCAI, 1019-1021, Karlsruhe, 1983.

Yuille, A. L. and Poggio, T. “Scaling Theorems for Zero-crossings,” AI Memo 722, MIT AI
Lab, 1983a.

Yuille, A. L. and Poggio, T. “Fingerprints Theorems for Zero-crossings,” AI Memo 730, MIT
AI Lab, 1983b.





